[codex] Apply generated cpflow GitHub Actions flow#732
[codex] Apply generated cpflow GitHub Actions flow#732
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Upstream token not passed to copy-image command
- Added the required -t upstream token argument to the cpflow copy-image-from-upstream command while keeping the token sourced from secrets via the environment.
Or push these changes by commenting:
@cursor push 3606e16506
Preview (3606e16506)
diff --git a/.github/workflows/cpflow-promote-staging-to-production.yml b/.github/workflows/cpflow-promote-staging-to-production.yml
--- a/.github/workflows/cpflow-promote-staging-to-production.yml
+++ b/.github/workflows/cpflow-promote-staging-to-production.yml
@@ -203,14 +203,13 @@
- name: Copy image from staging
env:
- # Pass the upstream token via env rather than `-t` so it doesn't appear in /proc/<pid>/cmdline.
CPLN_UPSTREAM_TOKEN: ${{ secrets.CPLN_TOKEN_STAGING }}
PRODUCTION_APP_NAME: ${{ vars.PRODUCTION_APP_NAME }}
CPLN_ORG_PRODUCTION: ${{ vars.CPLN_ORG_PRODUCTION }}
shell: bash
run: |
set -euo pipefail
- cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}"
+ cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" -t "${CPLN_UPSTREAM_TOKEN}" --org "${CPLN_ORG_PRODUCTION}"
- name: Deploy image to production
env:You can send follow-ups to the cloud agent here.
Code Review: cpflow GitHub Actions MigrationThis is a solid migration from hand-written to generated Security — meaningful fixes
Bugs fixed
Potential issues
Nits
Overall this is well-structured with meaningful security and reliability improvements. The one item I'd prioritize is the |
| if cpflow exists -a "${APP_NAME}" --org "${CPLN_ORG}"; then | ||
| echo "exists=true" >> "$GITHUB_OUTPUT" | ||
| else | ||
| echo "exists=false" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
This if/else around cpflow exists conflates two different failure modes: "app does not exist" (expected non-zero) and "API/auth error" (unexpected non-zero). Both land in the else branch and set exists=false.
Compare delete-app.sh in this same PR, which carefully pattern-matches the output ("Double check your org", "Unknown API token format", "ERROR", etc.) to distinguish a real failure from a genuine not-found and surface it as a hard error.
If there's a transient auth issue when a PR is pushed, this step sets exists=false and the pull_request path silently writes a step-summary saying the app doesn't exist yet — masking the real problem until the deploy is manually retried.
Suggested approach: mirror delete-app.sh's pattern — capture the output, check exit code, inspect the text for known error tokens, and exit non-zero on a real failure rather than treating it as "not found".
There was a problem hiding this comment.
Fixed in b9095c6 by capturing cpflow exists output and failing hard on auth/API error tokens instead of treating every non-zero exit as app-not-found.
|
|
||
| if [[ -f "${HOME}/.ssh/cpflow_build_key" ]]; then | ||
| eval "$(ssh-agent -s)" | ||
| trap 'ssh-agent -k >/dev/null; rm -f "${HOME}/.ssh/cpflow_build_key"' EXIT |
There was a problem hiding this comment.
The EXIT trap kills the SSH agent and removes cpflow_build_key, but the ~/.ssh/known_hosts file written in the previous step is not cleaned up here. On ephemeral GitHub-hosted runners this is harmless (the VM is discarded), but if this action is ever adopted on self-hosted runners with a persistent ~/.ssh, the custom known_hosts entries would accumulate across runs.
Consider adding rm -f "${HOME}/.ssh/known_hosts" to the trap, or using a temp file under $RUNNER_TEMP instead of ~/.ssh for the known_hosts.
| default: "3.3.1" | ||
| cpflow_version: | ||
| description: cpflow gem version | ||
| required: false |
There was a problem hiding this comment.
The CLI version is pinned here as a default, which is good for reproducibility. Worth noting: since sudo npm install -g fetches from the npm registry at runtime, a compromised release of @controlplane/cli@3.3.1 would silently affect all runs. Consider adding a SHA/integrity check (e.g. via npm install --integrity) or at minimum locking to a digest-pinned download, especially since this step runs in workflows that access staging/production tokens.
|
|
||
| current_image="${selected_image:-${first_image}}" | ||
| current_version="${selected_version:-${first_version}}" | ||
|
|
There was a problem hiding this comment.
The staging env-var parity check only compares names, not values. It will catch a missing DATABASE_URL in production, but it won't catch a case where production has the right key names but wrong or stale values (e.g. a staging secret was rotated but the production counterpart wasn't updated). This is expected behaviour for a name-only check — just worth documenting explicitly so it's clear what the check does and doesn't guarantee.
There was a problem hiding this comment.
Documented in b9095c6: the production promotion workflow checks that production has the same environment variable names as staging, but it does not compare secret values.
|
Code Review: Replace hand-written CI/CD Actions with generated cpflow- flow* Well-considered overhaul that hardens security, safety, and operational reliability. No blocking issues found. Security improvements Token no longer passed via --token flag — old setup-environment exposed it in /proc/cmdline and ps; now uses CPLN_TOKEN env var only. SSH key scoped to a dedicated step so ACTIONS_STEP_DEBUG cannot leak it from the build step. Delete command now gated on author_association (old workflow had no permission check). Fork PR protection blocks syncs before any secret is consumed. Old contains() command matching replaced with exact match. Safety improvements cancel-in-progress changed from true to false everywhere — old staging deploy risked partial GVC deployments mid-rollout. Prefix-based delete guard replaces the old substring check for production/staging names. Production rollback logic added with per-workload state capture and restore — old workflow had no rollback at all. Health checks now configurable (HEALTH_CHECK_RETRIES, HEALTH_CHECK_INTERVAL, HEALTH_CHECK_ACCEPTED_STATUSES). Issue 1: Fork-PR /deploy-review-app comment silently fails with no PR feedback When a collaborator comments /deploy-review-app on a fork PR, the job condition passes but Validate review app deployment source exits 1 with only a runner log message. No PR comment explains the rejection. Suggest posting an issues.createComment before exit 1, similar to how the unconfigured-secrets path writes to GITHUB_STEP_SUMMARY. Issue 2: cpflow exists error detection is string-matching fragile delete-app.sh uses case matching over stderr to distinguish app-not-found from auth/network errors. The inline TODO acknowledges this. If cpflow changes an error message, the pattern will not match and a real failure will be silently treated as app not found, causing delete to return success vacuously. The TODO to switch to a structured exit code is the right long-term fix. Issue 3: gem install / npm install without integrity pinning cpflow-setup-environment/action.yml installs @controlplane/cli@3.3.1 and cpflow 4.2.0 without hash verification. A re-tagged or compromised release would install silently. For production-adjacent CI, consider verifying SHA256 after install or using lockfiles with integrity entries. Issue 4: Nightly cleanup has no failure notification cpflow-cleanup-stale-review-apps.yml runs on cron with no failure notification. Silent failures accumulate stale apps over days. Consider a step gated on if: failure() that posts a GitHub issue or annotation. Minor observations APP_NAME / PR_NUMBER double-assignment is intentional (workflow-level env needed early for concurrency.group, step overwrites with resolved values) but a short comment would clarify intent. cpflow-review-app-help.yml using pull_request_target is correct and well-documented. deploy job in cpflow-deploy-staging.yml re-runs cpflow-setup-environment unnecessarily (only needs cpflow deploy-image) — minor redundancy, acceptable for simplicity. |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The case pattern matching over stderr is the right approach given cpflow exists doesn't expose a structured signal, and the inline TODO is the right way to track this. One additional defensive measure worth considering: log the full exists_output to the runner even when the "not found" path is taken, so there's an audit trail if unexpected output silently passed through the filter. Right now the output is printed only if non-empty, but it's easy to miss in logs.
Also worth noting: the pattern list doesn't include "timeout", "connection refused", or "ENOTFOUND" — all plausible network failure messages. If cpflow's Ruby HTTP client times out, it may surface Errno::ECONNREFUSED or Net::ReadTimeout, which would match "Net::", so that's covered. But curl-style timeouts would not. Low risk given this is a Ruby gem, but worth keeping in mind when bumping the cpflow version.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow cleanup-stale-apps -a "${{ vars.REVIEW_APP_PREFIX }}" --org "${{ vars.CPLN_ORG_STAGING }}" --yes |
There was a problem hiding this comment.
No failure notification exists for this cron job. If cpflow cleanup-stale-apps fails (auth error, API hiccup, etc.), stale review apps accumulate silently across days.
Consider appending a step:
- name: Notify on failure
if: failure()
run: |
echo "::error::Stale review app cleanup failed. Check the workflow logs."Or, if the team monitors GitHub issues, use gh issue create to open a tracking issue on failure so it's not lost in workflow history.
| } >> "$GITHUB_STEP_SUMMARY" | ||
| exit 0 | ||
| fi | ||
|
|
There was a problem hiding this comment.
When a collaborator comments /deploy-review-app on a fork PR, execution reaches this exit 1 with only a runner log message — no feedback is posted to the PR thread. The collaborator sees a red workflow run with no explanation in the PR.
Consider posting a comment before exiting. The GH_TOKEN is available from the Resolve PR ref and commit step but not inherited here; you'd need to add GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} to this step's env block. Something like:
gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \
--method POST \
--field body="Fork PRs cannot use /deploy-review-app because this workflow uses repository secrets. Ask a maintainer to deploy from a branch in the base repository." \
--silent || true
There was a problem hiding this comment.
Fixed in 1e46bfe: fork PR /deploy-review-app rejections now post a PR comment before exiting, while still keeping the deployment blocked because the workflow uses repository secrets.
| run: | | ||
| set -euo pipefail | ||
|
|
||
| sudo npm install -g @controlplane/cli@${{ inputs.cpln_cli_version }} |
There was a problem hiding this comment.
Version-pinned but not hash-verified. A malicious actor who gains write access to the @controlplane/cli npm package (or re-tags a release) could substitute a compromised binary. Same applies to gem install cpflow below.
For production-adjacent pipelines, consider adding a post-install checksum check, or at minimum document the expected SHA256 for each pinned version in a comment so drift is detectable during future version bumps.
| run: | | ||
| # Best-effort rollback: try every workload, aggregate failures, exit non-zero at the end | ||
| # if any failed. A single cpln hiccup shouldn't leave other workloads mid-promotion. | ||
| set -uo pipefail |
There was a problem hiding this comment.
The intentional omission of -e here (using set -uo pipefail instead of set -euo pipefail) is the correct design — it lets rollback continue across all workloads even if one cpln call fails, with rollback_failures tracking the aggregate. Worth a brief comment explaining the omission so a future reader doesn't add -e thinking it was forgotten:
| set -uo pipefail | |
| # Intentionally no -e: we want every workload to attempt rollback even if earlier ones fail. | |
| set -uo pipefail |
There was a problem hiding this comment.
Fixed in 1e46bfe with a comment explaining that set -e is intentionally omitted so rollback can attempt every workload and aggregate failures.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Token not persisted for subsequent workflow steps
- The composite action now writes CPLN_TOKEN to GITHUB_ENV so subsequent cpflow and cpln workflow steps inherit authentication without passing the token on the command line.
Or push these changes by commenting:
@cursor push f1eee564c7
Preview (f1eee564c7)
diff --git a/.github/actions/cpflow-setup-environment/action.yml b/.github/actions/cpflow-setup-environment/action.yml
--- a/.github/actions/cpflow-setup-environment/action.yml
+++ b/.github/actions/cpflow-setup-environment/action.yml
@@ -63,6 +63,12 @@
exit 1
fi
+ {
+ echo "CPLN_TOKEN<<__CPLN_TOKEN__"
+ echo "$CPLN_TOKEN"
+ echo "__CPLN_TOKEN__"
+ } >> "$GITHUB_ENV"
+
create_output=""
if ! create_output="$(cpln profile create default --org "$ORG" 2>&1)"; then
if ! echo "$create_output" | grep -qi "already exists"; thenYou can send follow-ups to the cloud agent here.
Code Review — PR #732: Generated cpflow GitHub Actions FlowOverviewThis is a well-structured replacement of hand-written CI/CD workflows with the generated Significant Improvements ✅
IssuesBug:
|
| fi | ||
| deploy_args+=(--org "${CPLN_ORG}" --verbose) | ||
|
|
||
| cpflow deploy-image "${deploy_args[@]}" |
There was a problem hiding this comment.
Bug: deploy runs even when build fails.
When a job has an explicit if:, GitHub Actions drops the implicit success() guard. Because this condition only checks validate-branch's output, the deploy job will start even if the build job failed, wasting runner minutes and attempting to deploy a missing image.
| cpflow deploy-image "${deploy_args[@]}" | |
| if: needs.validate-branch.outputs.is_deployable == 'true' && needs.build.result == 'success' |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The string-matching heuristic on cpflow exists error output (lines 43–54) is the main reliability risk in this file. The consequence of an unmatched error pattern is silently exiting 0 ("app not found") when the API actually returned an error, meaning the delete is skipped without any visible failure.
An alternative that inverts the logic: treat any non-zero exit with non-empty output as a failure unless it matches a known "not found" pattern:
# instead of matching known error strings, match known "not found" patterns
case "$exists_output" in
*"not found"*|*"does not exist"*)
echo "⚠️ Application does not exist: $APP_NAME"
exit 0
;;
*)
echo "❌ ERROR: failed to determine whether application exists" >&2
printf '%s\n' "$exists_output" >&2
exit 1
;;
esacThis is safer by default — unknown output is treated as an error rather than a no-op. The tradeoff is that you'd need to know the exact "not found" wording from cpflow exists, which has the same coupling problem in the other direction. The TODO comment already captures this; just flagging it for awareness.
| echo "ready=false" >> "$GITHUB_OUTPUT" | ||
| { | ||
| echo "Control Plane review app automation is not configured yet." | ||
| echo |
There was a problem hiding this comment.
This top-level APP_NAME is computed from the raw event payload and is later overwritten in $GITHUB_ENV by the Resolve PR ref and commit step (echo "APP_NAME=..." >> "$GITHUB_ENV"). The step-level write wins, so this declaration is dead configuration and can be confusing. Consider removing it and relying solely on the step-computed value, which is already the authoritative one (it handles all three event types correctly).
Code Review: cpflow Generated GitHub Actions FlowThis is a meaningful security and reliability upgrade over the handwritten workflows. Several issues from the old setup are fixed here; I've also flagged a few new items worth addressing. ✅ Notable Improvements
🔴 Issues1. The delete workflow triggers on 2. The same fragile string-matching logic ( 🟡 Warnings3. In 4. Production health check accepts 301/302 by default
5. Nightly cleanup has no failure notification
ℹ️ Informational6.
7. These will pick up any non-breaking updates automatically, which is generally fine. However, |
| name: Delete Review App | ||
|
|
||
| on: | ||
| pull_request_target: |
There was a problem hiding this comment.
pull_request_target deserves an explanation comment here.
pull_request_target runs in the context of the base repo (write permissions, access to secrets) and is required for fork PR cleanup — pull_request can't post comments on fork PRs because the token is read-only. This is safe here because no PR code is ever checked out (the actions/checkout@v4 below fetches base-branch code only).
cpflow-review-app-help.yml has a good comment explaining this exact pattern. Without a parallel comment here, a future maintainer may "fix" this to pull_request thinking it's safer, silently breaking fork PR cleanup.
| pull_request_target: | |
| # pull_request_target is intentional: it has write permission to comment on PRs from | |
| # forks, where `pull_request` would be read-only. This is safe because no PR code is | |
| # checked out — the job only calls cpflow with base-repo credentials. | |
| # Do not switch this to `pull_request` or add a checkout of the PR ref without re-evaluating. | |
| pull_request_target: |
| # Make the token available to later workflow steps without putting it on argv. | ||
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | ||
| { | ||
| echo "CPLN_TOKEN<<${token_delimiter}" | ||
| printf '%s\n' "$CPLN_TOKEN" | ||
| echo "${token_delimiter}" | ||
| } >> "$GITHUB_ENV" |
There was a problem hiding this comment.
Writing CPLN_TOKEN to GITHUB_ENV propagates it to every subsequent step in the calling workflow — not just the steps that explicitly use it. This is a necessary tradeoff (cpflow picks it up from the environment automatically), but it's worth documenting so future maintainers don't inadvertently add steps that log environment state.
| # Make the token available to later workflow steps without putting it on argv. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" | |
| # Persist CPLN_TOKEN to GITHUB_ENV so all subsequent steps can authenticate | |
| # with cpflow/cpln without putting the token on the command line. | |
| # Trade-off: the token is present in every subsequent step's environment. | |
| # Do not add steps that dump env vars (e.g., `env | sort`) after this point. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" |
| # TODO: replace this string-matching with a structured signal once `cpflow exists` exposes one | ||
| # (e.g. a distinct exit code for "not found" vs. API/auth errors, or `cpflow exists --json`). | ||
| # Until then, keep this list in sync if `cpflow exists` starts emitting new error patterns — | ||
| # any unmatched error string would otherwise be silently treated as "app not found". | ||
| exists_output="" | ||
| if ! exists_output="$(cpflow exists -a "$APP_NAME" --org "$CPLN_ORG" 2>&1)"; then | ||
| case "$exists_output" in | ||
| *"Double check your org"*|*"Unknown API token format"*|*"ERROR"*|*"Error:"*|*"Traceback"*|*"Net::"*) |
There was a problem hiding this comment.
The same cpflow exists error-pattern matching appears verbatim in the Check if review app exists step in cpflow-deploy-review-app.yml (line ~195). The TODO comment here is accurate but only covers this file — the workflow copy is uncovered.
If cpflow adds a new error pattern (e.g. "rate limit" or "timeout"), both locations need updating in sync. Consider extracting this into a shared script (e.g. .github/scripts/cpflow-app-exists.sh) that both the action and the workflow call, so the pattern list is maintained in one place.
| # repo variable to tighten this for apps that expose a dedicated health endpoint | ||
| # (e.g. "200" for a plain /health, or "200 401 403" for apps that auth-gate / without | ||
| # redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} |
There was a problem hiding this comment.
Accepting 301/302 as "healthy" means a redirect to a login page, maintenance page, or even a misconfigured CDN passes the health check. The comment correctly explains why (curl without -L, Rails auth-gating /), but a redirect can pass even when the destination is broken.
Consider adding a step summary warning when a non-200 status is accepted, so operators reviewing the promotion log have visibility:
if [[ "${http_status}" == "${accepted}" ]]; then
if [[ "${http_status}" != "200" ]]; then
echo "::warning::Health check passed with HTTP ${http_status} (not 200) — verify the redirect destination is healthy."
fi
echo "healthy=true" >> "$GITHUB_OUTPUT"
exit 0
fiAlso worth documenting in .controlplane/shakacode-team.md that setting HEALTH_CHECK_ACCEPTED_STATUSES=200 and adding a /health endpoint is the recommended production posture.
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| - cron: "0 0 * * *" |
There was a problem hiding this comment.
Scheduled workflows that fail silently are easy to miss — GitHub does send an email for failed scheduled runs but only to the repo owner by default, and only after 7 days of consecutive failures (as of current GitHub behavior).
Consider adding a failure notification step, or at minimum documenting that the team should verify GitHub's "Notify me of failed workflow runs" notification setting is enabled. Stale review apps that aren't cleaned up accrue Control Plane costs.
| - cron: "0 0 * * *" | |
| - cron: "0 3 * * *" # 3 AM UTC — avoids exact midnight when many repos run scheduled jobs simultaneously |
(The cron timing is a minor suggestion — staggering off midnight reduces GitHub's scheduler contention, making the run more likely to start on time.)
| # Override per-repo by setting `CPLN_CLI_VERSION` / `CPFLOW_VERSION` repo variables; | ||
| # an empty input falls back to the action's pinned default below. | ||
| default_cpln_cli_version="3.3.1" | ||
| default_cpflow_version="5.0.0.rc.0" |
There was a problem hiding this comment.
RC version as production default — 5.0.0.rc.0 is a release candidate. Using an RC as the fallback version in a generated template that ships into production CI pipelines is risky: RCs may contain breaking changes or regressions that won't be caught until a live deploy fails.
Before merging, please either:
- Replace this with the stable release once
5.0.0is published, or - Hold the PR until the stable release is available and pin to that.
The same RC version is also hardcoded in the bootstrap step of cpflow-deploy-review-app.yml (see comment there).
| uses: ruby/setup-ruby@v1 | ||
| with: | ||
| ruby-version: "3.4" | ||
|
|
There was a problem hiding this comment.
Version string duplicated across two files — This bootstrap step hardcodes "5.0.0.rc.0" independently of default_cpflow_version in cpflow-setup-environment/action.yml. If one is bumped without the other, the bootstrap step will install a different cpflow version than the rest of the workflow, which could silently generate different action files.
Consider extracting the version to a single source of truth (e.g., a step output or an env var read from one file), or at minimum add a comment pointing to the companion definition in cpflow-setup-environment.
| for attempt in $(seq 1 "${CPFLOW_MAX_RETRIES}"); do | ||
| echo "Health check attempt ${attempt}/${CPFLOW_MAX_RETRIES}" | ||
|
|
||
| workload_stderr="$(mktemp)" |
There was a problem hiding this comment.
Temp files created per loop iteration without a trap — mktemp is called on every retry iteration; each file is cleaned up within its iteration (rm -f "${workload_stderr}" on both success and failure paths), so the happy path is fine. However, if the runner is killed or the script is interrupted between mktemp and the paired rm -f, the temp file leaks.
A trap registered before the loop is the standard guard:
| workload_stderr="$(mktemp)" | |
| workload_stderr="$(mktemp)" | |
| trap 'rm -f "${workload_stderr}"' EXIT |
Or, restructure to create one temp file before the loop and reuse it across iterations.
| # deploy branches unless `cpflow generate-github-actions --staging-branch BRANCH` | ||
| # was used. If STAGING_APP_BRANCH is later changed in repository variables, keep | ||
| # this list in sync so pushes to that branch actually trigger the workflow. | ||
| branches: ["master"] |
There was a problem hiding this comment.
Push trigger can silently diverge from STAGING_APP_BRANCH variable — The on.push.branches filter is hardcoded to ["master"] (noted in the comment above). If STAGING_APP_BRANCH is later changed in repository variables to a different branch, pushes to that new branch won't trigger this workflow at all — they'll silently skip deployment with no error.
The existing comment documents the limitation well, but a runtime guard in the validate-branch job would make the misconfiguration observable. Consider adding a ::warning:: or step summary entry when GITHUB_REF_NAME matches STAGING_APP_BRANCH but doesn't match any of the hardcoded trigger branches, so operators know to update the branches: filter too.
Code Review: cpflow Generated GitHub Actions MigrationOverviewThis PR replaces hand-written Control Plane GitHub Actions with a generated Security Improvements ✅
Reliability Improvements ✅
Issues to Address1. Release Candidate as Default Version (Medium Risk — see inline comment)
2.
|
Code Review: Replace hand-written CI/CD with generated
|
| CPLN_UPSTREAM_TOKEN: ${{ secrets.CPLN_TOKEN_STAGING }} | ||
| PRODUCTION_APP_NAME: ${{ vars.PRODUCTION_APP_NAME }} | ||
| CPLN_ORG_PRODUCTION: ${{ vars.CPLN_ORG_PRODUCTION }} | ||
| STAGING_IMAGE: ${{ steps.staging-image.outputs.image }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" --image "${STAGING_IMAGE}" |
There was a problem hiding this comment.
Bug: CPLN_UPSTREAM_TOKEN is exported but never consumed by the command.
cpflow copy-image-from-upstream doesn't read CPLN_UPSTREAM_TOKEN from the environment automatically — it expects the upstream token via --upstream-token (or equivalent flag). As written, the command authenticates only with the production profile and will fail when staging and production use separate service accounts.
The comment says "pass via env rather than -t" to avoid leaking the value into /proc/<pid>/cmdline, which is the right instinct — just reference the env var as an argument value:
| CPLN_UPSTREAM_TOKEN: ${{ secrets.CPLN_TOKEN_STAGING }} | |
| PRODUCTION_APP_NAME: ${{ vars.PRODUCTION_APP_NAME }} | |
| CPLN_ORG_PRODUCTION: ${{ vars.CPLN_ORG_PRODUCTION }} | |
| STAGING_IMAGE: ${{ steps.staging-image.outputs.image }} | |
| shell: bash | |
| run: | | |
| set -euo pipefail | |
| cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" --image "${STAGING_IMAGE}" | |
| cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" \ | |
| --upstream-token "${CPLN_UPSTREAM_TOKEN}" \ | |
| --org "${CPLN_ORG_PRODUCTION}" --image "${STAGING_IMAGE}" |
This keeps the token out of the command line (it's still sourced from the env var, not a literal) while actually passing it to the tool.
| # Override per-repo by setting `CPLN_CLI_VERSION` / `CPFLOW_VERSION` repo variables; | ||
| # an empty input falls back to the action's pinned default below. | ||
| default_cpln_cli_version="3.3.1" | ||
| default_cpflow_version="5.0.0.rc.0" |
There was a problem hiding this comment.
Release candidate as a production default is risky.
5.0.0.rc.0 will be the fallback for every repo that doesn't explicitly set CPFLOW_VERSION. Pre-release gems can have breaking changes or undocumented behaviour. Please update this to the latest stable cpflow release before merging, or — if 5.x isn't GA — pin to the latest 4.x stable here and upgrade the default once 5.x ships.
| default_cpflow_version="5.0.0.rc.0" | |
| default_cpflow_version="4.1.1" |
(replace with the actual latest stable tag)
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml |
There was a problem hiding this comment.
Fragile in-place string patch on a generated file.
This ruby -0pi -e 'gsub!(...)' relies on exact string literals from the upstream generator template. If the generator's wording ever changes, the substitutions silently no-op and the committed action.yml will contain raw ${{ vars.CPLN_CLI_VERSION }} / ${{ vars.CPFLOW_VERSION }} interpolation expressions, which GitHub Actions will evaluate at runtime (potentially emitting empty strings or variable reference errors).
Since the cpflow-* actions already exist in this repo, the entire bootstrap block (Set up Ruby for cpflow bootstrap + Bootstrap generated cpflow actions) is effectively dead code on every normal run — these steps only fire when the action file is missing. Consider removing the bootstrap mechanism and relying on the normal regeneration workflow documented in readme.md instead.
Code Review: Replace hand-written Control Plane Actions with generated
|
| uses: actions/github-script@v7 | ||
| with: | ||
| script: | | ||
| const commentId = Number("${{ steps.create-comment.outputs.comment-id }}"); |
There was a problem hiding this comment.
[Medium] Expression injection via external API value
workload_url is sourced from the Control Plane API (jq -r '.status.endpoint // empty') and interpolated directly into a JavaScript string literal. A double-quote in the URL breaks the JS context; a more malicious payload from a compromised upstream could execute arbitrary JavaScript in the github-script runner.
The PR already uses env-variable routing consistently in run: steps for this exact reason — the same pattern applies to github-script:
| const commentId = Number("${{ steps.create-comment.outputs.comment-id }}"); | |
| with: | |
| env: | |
| DEPLOY_COMMENT_ID: ${{ steps.create-comment.outputs.comment-id }} | |
| DEPLOY_DEPLOYMENT_ID: ${{ steps.init-deployment.outputs.result }} | |
| DEPLOY_APP_URL: ${{ steps.workload.outputs.workload_url }} | |
| DEPLOY_JOB_STATUS: ${{ job.status }} | |
| script: | | |
| const commentId = Number(process.env.DEPLOY_COMMENT_ID); | |
| const deploymentId = process.env.DEPLOY_DEPLOYMENT_ID; | |
| const appUrl = process.env.DEPLOY_APP_URL; | |
| const success = process.env.DEPLOY_JOB_STATUS === "success"; |
| - name: Update PR comment with build status | ||
| if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success') | ||
| uses: actions/github-script@v7 | ||
| with: |
There was a problem hiding this comment.
[Minor] Step output interpolated into JS string — inconsistent with env-routing pattern
The comment ID is a GitHub-controlled integer so there's no real injection risk here, but this is inconsistent with the env-variable routing pattern the rest of the PR applies to run: steps. The same issue recurs in the other Update PR comment steps (lines ~363). Prefer:
| with: | |
| const commentId = Number(process.env.CREATE_COMMENT_ID); |
with env: { CREATE_COMMENT_ID: "${{ steps.create-comment.outputs.comment-id }}" } on the step.
| fi | ||
|
|
||
| - name: Checkout repository | ||
| if: steps.check-branch.outputs.is_deployable == 'true' |
There was a problem hiding this comment.
[Minor] Missing persist-credentials: false
Every other checkout in this PR uses persist-credentials: false, but this one does not. The GITHUB_TOKEN credential helper is left in .git/config for the lifetime of the runner, which is inconsistent with the stated security intent and can persist on self-hosted runners.
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| with: | |
| persist-credentials: false |
| end | ||
|
|
||
| unless app_config.is_a?(Hash) | ||
| warn "Error: app '#{app_name}' is not defined under `apps:` in `.controlplane/controlplane.yml`." |
There was a problem hiding this comment.
[Minor] Hard exit on unrecognised app name will break review-app deployments
If the review app name (e.g. myapp-pr-42) matches neither an exact key in controlplane.yml nor a match_if_app_name_starts_with prefix, this exits 1 and fails the deploy. The intent of this action is to answer "does this app have a release script?". "App not found" is functionally the same as "no release script" — the deploy should proceed without --run-release-phase rather than abort.
Consider changing the not-found branch to:
# App not in config — assume no release script rather than failing the deploy
puts ""and remove the exit 1.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document |
There was a problem hiding this comment.
[Minor] Bootstrap installs an RC gem from RubyGems at workflow runtime
If the composite action files are absent from the default branch (e.g. after a force-push or accidental deletion), this fallback installs cpflow 5.0.0.rc.0 — an RC gem that can be yanked — and then regenerates the action files into the live checkout. Two concerns:
- Availability: a yanked RC gem will break every review-app deploy until the version in this file is updated.
- Trust boundary: the regenerated composite actions are written into the same workspace that subsequent steps load as trusted code. Confirm that
cpflow generate-github-actionscannot overwrite any file that is lateruses:-d by this workflow.
At a minimum, document that the pinned RC version here and in cpflow-setup-environment/action.yml must be kept in sync, and consider switching to a GA release once one is available.
| # immutable commit SHA. | ||
| steps: | ||
| - name: Set up Ruby | ||
| uses: ruby/setup-ruby@v1 |
There was a problem hiding this comment.
[Minor] Floating action version — not pinned to an immutable SHA
The existing comment explains the tradeoff for generated templates. For repos that opt in to supply-chain hardening, replace with the corresponding immutable SHA, e.g.:
uses: ruby/setup-ruby@354a1ad156761f5ee39be78be9a1e9ee7e4dad7e # v1.244.0
Dependabot with versioning-strategy: lockfile-only or Renovate in SHA-pinning mode will maintain these automatically.
|
This is a well-structured migration from hand-written workflows to generated cpflow-* GitHub Actions. The security thinking is thorough — fork gating, double-checkout pattern, persist-credentials: false, secrets via env vars, and a prefix-safety check on deletion are all done correctly. Security
Code Quality
Minor
Overall: The PR is a significant improvement — better fork safety, structured rollback, env-var parity checks before production promotion, and prefix-guarded deletion are all good additions. The issues above are mostly low-to-medium severity, but the persist-credentials inconsistency and RC version bootstrap are easy wins to clean up before merge. |
| script: | | ||
| const commentId = Number("${{ steps.create-comment.outputs.comment-id }}"); | ||
| const deploymentId = "${{ steps.init-deployment.outputs.result }}"; | ||
| const appUrl = "${{ steps.workload.outputs.workload_url }}"; |
There was a problem hiding this comment.
The workload_url is sourced from cpln workload get ... | jq -r '.status.endpoint // empty' — an external API value. If the endpoint string contains " (or a backtick in template-literal context), it will break this JS string literal and could be exploited if the Control Plane API response is ever tampered with.
All other dynamic values here (WORKFLOW_URL, CONSOLE_URL) are already routed through core.exportVariable and read as process.env.*. Apply the same pattern:
| const appUrl = "${{ steps.workload.outputs.workload_url }}"; | |
| const appUrl = process.env.APP_URL || ""; |
And add the export alongside the existing core.exportVariable calls in the "Set deployment links" step above:
core.exportVariable("APP_URL", workload_url); // in the "Retrieve app URL" shell step, or pass via a prior github-script step| if: ${{ hashFiles('.github/actions/cpflow-validate-config/action.yml') == '' }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml |
There was a problem hiding this comment.
Two concerns with this bootstrap block:
-
RC version in production CI:
5.0.0.rc.0is a pre-release. If the actions are already committed to the base branch (the normal case), this step is skipped — but for a fresh repo setup or if the actions are accidentally deleted, this RC gem installs and generates new action files mid-run. Using a GA release (or no version pin withgem install cpflow) would be safer. -
Inline patching of generated files: The
ruby -0pione-liner patchescpflow-setup-environment/action.ymlafter generation. If the upstream template text changes in a future cpflow release, this substitution silently becomes a no-op, and the literal${{ vars.CPLN_CLI_VERSION }}Markdown in the description won't be fixed. Consider either:- Committing the generated actions to the base branch so this bootstrap never runs in practice, or
- Adding an explicit check after the substitution (
grep -q "repository variable value" ... || exit 1) to fail loudly if the patch didn't apply.
| - name: Checkout repository | ||
| if: steps.check-branch.outputs.is_deployable == 'true' | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
This checkout is missing persist-credentials: false, which is present on every other actions/checkout call in this PR. This leaves the GITHUB_TOKEN credential helper stored in .git/config for the duration of the job, even though nothing in this job pushes to git.
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| with: | |
| persist-credentials: false |
| fi | ||
|
|
||
| if [[ "${staging_image_ref}" == /org/*/image/* ]]; then | ||
| staging_image="${staging_image_ref##*/image/}" |
There was a problem hiding this comment.
The ## (greedy) operator strips the longest prefix matching */image/. For the expected Control Plane path format /org/myorg/image/myapp this is fine, but an image ref that contains a second /image/ segment (e.g. /org/myorg/image/myapp/image/v2) would be stripped to v2 instead of myapp/image/v2.
Using the non-greedy # strips only the first /image/ occurrence, which is more predictable:
| staging_image="${staging_image_ref##*/image/}" | |
| staging_image="${staging_image_ref#*/image/}" |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Health check variables ignore documented repository variable overrides
- Updated the workflow env defaults to read the matching repository variables for health check and rollback readiness settings.
Or push these changes by commenting:
@cursor push ed231dd54e
Preview (ed231dd54e)
diff --git a/.github/workflows/cpflow-promote-staging-to-production.yml b/.github/workflows/cpflow-promote-staging-to-production.yml
--- a/.github/workflows/cpflow-promote-staging-to-production.yml
+++ b/.github/workflows/cpflow-promote-staging-to-production.yml
@@ -16,8 +16,8 @@
# Worst-case wall time per attempt is HEALTH_CHECK_INTERVAL plus the curl --max-time below
# (10s), so the defaults give a ~10 minute window (24 × (15 + 10) = 600s) — enough for
# most Rails cold boots (asset precompile + db:migrate + workload readiness).
- HEALTH_CHECK_RETRIES: 24
- HEALTH_CHECK_INTERVAL: 15
+ HEALTH_CHECK_RETRIES: ${{ vars.HEALTH_CHECK_RETRIES || '24' }}
+ HEALTH_CHECK_INTERVAL: ${{ vars.HEALTH_CHECK_INTERVAL || '15' }}
# Space-separated list of HTTP statuses considered healthy. The default accepts 301/302
# because `curl` is invoked without `-L`, so a root `/` that redirects to a login page
# (common for Rails apps that auth-gate `/`) would otherwise be reported as unhealthy
@@ -31,8 +31,8 @@
# expose a dedicated health endpoint (e.g. "200" for a plain /health, or "200 401 403"
# for apps that auth-gate / without redirecting).
HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }}
- ROLLBACK_READINESS_RETRIES: 24
- ROLLBACK_READINESS_INTERVAL: 15
+ ROLLBACK_READINESS_RETRIES: ${{ vars.ROLLBACK_READINESS_RETRIES || '24' }}
+ ROLLBACK_READINESS_INTERVAL: ${{ vars.ROLLBACK_READINESS_INTERVAL || '15' }}
PRIMARY_WORKLOAD: ${{ vars.PRIMARY_WORKLOAD }}
concurrency:You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit ff6a820. Configure here.
| # for apps that auth-gate / without redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} | ||
| ROLLBACK_READINESS_RETRIES: 24 | ||
| ROLLBACK_READINESS_INTERVAL: 15 |
There was a problem hiding this comment.
Health check variables ignore documented repository variable overrides
Medium Severity
HEALTH_CHECK_RETRIES, HEALTH_CHECK_INTERVAL, ROLLBACK_READINESS_RETRIES, and ROLLBACK_READINESS_INTERVAL are hardcoded in the workflow env block, so setting them as repository variables has no effect. This contradicts the workflow's own comment ("Override these by … setting the matching repository variable") and the documentation in cpflow-help.md and shakacode-team.md, which list all four as optional repository variables. HEALTH_CHECK_ACCEPTED_STATUSES on the same block correctly uses ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }}; these four need the same pattern.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit ff6a820. Configure here.
Code Review: Replace hand-written GitHub Actions with generated cpflow-* flowOverviewThis PR replaces five hand-written Control Plane GitHub Actions workflows/actions with generated Security Improvements (significant wins)
Issues and Concerns1. Release candidate in production default (medium risk)In A release candidate ships as the default for a production-critical deployment tool. If 2. Fragile bootstrap-and-patch step (maintainability concern)In 3.
|
| # Override per-repo by setting `CPLN_CLI_VERSION` / `CPFLOW_VERSION` repo variables; | ||
| # an empty input falls back to the action's pinned default below. | ||
| default_cpln_cli_version="3.3.1" | ||
| default_cpflow_version="5.0.0.rc.0" |
There was a problem hiding this comment.
This pins a release candidate as the default for a production-critical deployment tool. If 5.0.0.rc.0 contains a bug, every repo that inherits this default is affected. Please update to the stable release before merge (or document a concrete plan to do so before this workflow runs on master).
| - name: Bootstrap generated cpflow actions | ||
| if: ${{ hashFiles('.github/actions/cpflow-validate-config/action.yml') == '' }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml |
There was a problem hiding this comment.
This bootstrap block has two concerns:
-
It should never run for this repo:
cpflow-validate-config/action.ymlis committed here, sohashFiles(...)will always return a non-empty string. The block is dead code for this repo and only matters for repos that haven't yet committed the generated files. -
The
ruby -0pi -epatch is very fragile: It regex-substitutes specific hardcoded strings in the generatedcpflow-setup-environment/action.yml. If the upstream template changes that wording, the substitution silently no-ops and the file drifts undetected.
Recommend either removing the bootstrap entirely (since the generated files are committed) or replacing it with a CI drift-detection step that fails loudly if the committed generated files differ from what cpflow generate-github-actions would produce today.
| - name: Checkout repository | ||
| if: steps.check-branch.outputs.is_deployable == 'true' | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
The build and deploy jobs both set persist-credentials: false on their checkout steps, but this validate-branch checkout does not. For consistency and to keep GITHUB_TOKEN out of .git/config, add persist-credentials: false here too.
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| with: | |
| persist-credentials: false |
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && | ||
| github.event.comment.body == '/deploy-review-app' && | ||
| contains(fromJson('["OWNER","MEMBER","COLLABORATOR"]'), github.event.comment.author_association)) |
There was a problem hiding this comment.
The allowlist covers OWNER, MEMBER, and COLLABORATOR, but not CONTRIBUTOR. GitHub assigns CONTRIBUTOR to anyone whose PR has previously been merged into the repo. Depending on the project's intent, CONTRIBUTOR may be a reasonable addition so past contributors can deploy review apps without needing explicit collaborator access. Worth confirming whether this is intentional.



Summary
cpflow-*action and workflow set fromcpflow generate-github-actions.cpflow-*action/workflow names.Verification
BUNDLE_GEMFILE=/tmp/cpflow-pr-278/Gemfile bin/conductor-exec bundle exec ruby /tmp/cpflow-pr-278/bin/cpflow github-flow-readiness— no blocking readiness issues detected.bin/conductor-exec ruby -e 'require "yaml"; paths = Dir[".github/**/*.yml"] + Dir[".github/**/*.yaml"] + Dir[".controlplane/**/*.yml"]; paths.sort.each { |path| YAML.load_file(path, aliases: true); puts path }'bin/conductor-exec git diff --cached --checkbin/conductor-exec bundle exec rubocopNot Run / Blocked
bin/conductor-exec bundle exec rspeccould not complete locally because PostgreSQL was not running at/tmp/.s.PGSQL.5432; after the first Rails load failure, Coveralls setup also reported repeated coverage initialization errors.Note
Medium Risk
Touches CI/CD deployment and production promotion workflows; mistakes could block deployments or mis-handle secrets, though the changes add additional validation and safety checks.
Overview
Replaces the hand-written Control Plane GitHub Actions with the generated
cpflow-*composite actions and workflows for review apps, staging deploys, production promotion, and nightly stale review-app cleanup.The new flow makes review apps explicitly opt-in via exact
/deploy-review-appcomments (with automatic redeploys after creation) and adds stricter safeguards around secrets/trust boundaries (config validation, fork restrictions, safer token/SSH handling, and prefix-guarded deletes). Docs and internal team notes are updated to reference the new workflows and required repo secrets/variables.Reviewed by Cursor Bugbot for commit 2c49140. Bugbot is set up for automated code reviews on this repo. Configure here.